ABSTRACT

Successful halo-model descriptions of the luminosity dependence of clustering distin- 
guish between the central galaxy in a halo and all the others (satellites). To include 
colors, we provide a prescription for how the color-magnitude relation of centrals and 
satellites depends on halo mass. This follows from two assumptions: (i) the bimodality 
of the color distribution at fixed luminosity is independent of halo mass, and (ii) the 
fraction of satellite galaxies which populate the red sequence increases with luminosity. 
We show that these two assumptions allow one to build a model of how galaxy clustering
depends on color without any free parameters beyond those required to
model the luminosity dependence of galaxy clustering. We then show that the resulting
model is in good agreement with the distribution and clustering of colors in the SDSS, 
both by comparing the predicted correlation functions of red and blue galaxies with 
measurements, and by comparing the predicted color mark correlation function with 
the measured one. Mark correlation functions are powerful tools for identifying and 
quantifying correlations between galaxy properties and their environments: our results 
indicate that the correlation between halo mass and environment is the primary driver 
for correlations between galaxy colors and the environment; additional correlations as- 
sociated with halo 'assembly bias' are relatively small. Our approach shows explicitly 
how to construct mock catalogs which include both luminosities and colors — thus 
providing realistic training sets for, e.g., galaxy cluster finding algorithms. Our pre- 
scription is the first step towards incorporating the entire spectral energy distribution 
into the halo model approach. 

Key words: methods: analytical - methods: statistical - galaxies: formation - galaxies: 
evolution - galaxies: clustering - galaxies: halos - dark matter - large scale structure 
of the universe 



1 INTRODUCTION 

The halo model is a useful language for discussing how 
galaxy clustering depends on galaxy type: galaxy bias (see
Cooray & Sheth 2002 for a review). To date, the halo model 
has been used to provide a useful framework for modeling 
the luminosity dependence of galaxy clustering. The main 
goal of this paper is to extend the halo model description of 
galaxy luminosities to include colors. This is an important 
step towards the ultimate goal of providing a description of 
how the properties of a galaxy, its morphology and spectral 
energy distribution, are correlated with those of its neigh- 
bors. The hope is that, by relating such correlations between 
galaxies to the properties of their parent dark matter halos,
the halo model will provide a useful guide in the study of 
galaxy formation. 

* E-mail: skibba@mpia.de (RAS); shethrk@physics.upenn.edu 
(RKS) 



The halo model description of the luminosity depen- 
dence of clustering is usually done in three rather different 
ways, which have come to be known as the 'halo occupa- 
tion distribution' (HOD; Jing, Mo & Borner 1998; Benson 
et al. 2000; Seljak 2000; Scoccimarro et al. 2001; Berlind & 
Weinberg 2002; Zehavi et al. 2005), the 'conditional luminosity
function' (CLF; Peacock & Smith 2000; Yang et al.
2003; Cooray 2006; van den Bosch et al. 2007a), and the 
'subhalo abundance matching' (SHAM; Klypin et al. 1999; 
Kravtsov et al. 2004; Vale & Ostriker 2006; Conroy, Wech- 
sler & Kravtsov 2006) methods. The HOD approach uses 
the abundance and spatial distribution of a given galaxy 
population (typically, just the two-point clustering statis- 
tics) to determine how the number of galaxies depends on 
the mass of the parent halo. This is done by studying a 
sequence of volume limited galaxy catalogs, each contain- 
ing galaxies more luminous than some threshold luminosity. 
The CLF method attempts, instead, to match the observed 
luminosity function by specifying how the luminosity distribution
in halos changes as a function of halo mass. One
can infer the CLF from the HOD approach, and vice versa,
so the question arises as to which is the more efficient de- 
scription. For a given catalog, the HOD method requires the 
fitting of just two free parameters, so it is relatively straight- 
forward. The CLF method requires many more parameters 
to be fit simultaneously, but uses fewer volume limited cata- 
logs. SHAMs first identify the subhalos within virialized ha- 
los in simulations, and then use subhalo properties to match 
the subhalo abundances to the observed distribution of lu- 
minosities. Once this has been done, CLFs or HODs can be 
measured in the simulations. 

In SPH and semi-analytic galaxy formation models, cen- 
tral and satellite galaxies are rather different populations 
(e.g., Kauffmann et al. 1999; Sheth & Diaferio 2001; Guzik
& Seljak 2002; Benson et al. 2003; Sheth 2005; Zheng et al. 
2005). And so too, in the HOD and CLF approaches to the 
halo model, the central galaxy in a halo is assumed to be 
very different from all the others, which are called satellites. For
example, the CLF approach must provide a description of 
how the central and satellite luminosity functions vary as a 
function of halo mass. The HOD-based analyses predict that 
the satellite galaxy luminosity function should be approxi- 
mately independent of halo mass, and hence of group and/or 
cluster properties (Skibba et al. 2006). Skibba et al. (2007) 
present evidence from the SDSS in support of this prediction.
More recent analysis of a rather different group catalog
has confirmed this finding (Hansen et al. 2008). Skibba et 
al. argued that this independence can reduce the required 
number of free parameters in CLF-based analyses. 

One of the goals of the present work is to show that 
the HOD-based approach also provides a rather simple way 
to understand how galaxy clustering depends on color. In 
essence, it provides a simple algorithm for specifying how 
the joint CLF (i.e., the luminosity distribution in two different
bands) varies with halo mass. In principle, this can be
done by splitting the sample up into small bins of luminos- 
ity and color, and studying how the clustering signal in each 
bin changes. Zehavi et al. (2005) describe a first attempt at 
this - for each bin in luminosity, they use two bins in color: 
'red' or 'blue'. (Croton et al. 2007 also study the difference 
in clustering strengths of red and blue galaxies. They use 
related statistics, but do not attempt a halo-model descrip- 
tion of their measurements.) As sample sizes increase, it will 
become possible to split the sample into many more color 
bins. However, even for this simplest case, Zehavi et al. were 
led to a rather more complex parametrization of the HOD 
than was necessary for the luminosities - they caution that, 
as a result, there are more degeneracies amongst their pa- 
rameter choices, and so the constraints on the HODs they 
obtain are considerably weaker than for luminosities alone. 
While such a brute force approach to determining the HOD 
is certainly possible, we argue below that there may be some 
merit to recasting the problem as one in which the physics 
and statistics are more closely related. 

In essence, our approach exploits the fact that, to a 
good approximation, galaxies appear to be bimodal in their 
properties (e.g., Blanton et al. 2003). In the present context,
we are interested in the fact that the distribution of colors at
fixed luminosity is bimodal (e.g., Baldry et al. 2004; Willmer
et al. 2006). Our approach is to couple this bimodality with
the centre-satellite split in the halo model.

Figure 1. Bimodal distribution of $g - r$ color in the SDSS. Smooth
curves show that, at fixed luminosity, the distribution is well modeled
by the sum of two Gaussian components.



This paper is organized as follows. Section 2 describes
our approach: it shows the correlation between color and luminosity
in the SDSS sample, and then describes a model for
the luminosities and colors of centrals and satellites which
is designed to reproduce this bimodality. Section 3 describes
how to use our model to generate mock catalogs which have
the correct luminosity dependence of clustering and the observed
color-magnitude relation, as well as how to incorporate
our approach into a halo model description of the
color-mark two-point correlation function. Section 4 provides
a comparison of our model predictions with measurements
from the SDSS. These include the clustering signal
from 'red' and 'blue' galaxies (defined as being redder or 
bluer than a critical luminosity dependent color) and the 
clustering signal when galaxies are weighted by color - the 
color-mark correlation function. A final section summarizes 
our findings. 

Throughout, the restframe magnitudes we quote are associated
with SDSS filters shifted to $z = 0.1$; the absolute
magnitude of the Sun in this $r$-band filter is 4.76 (Blanton
et al. 2003). Where necessary, we assume a flat background
cosmological model in which $\Omega_0 = 0.3$, the cosmological
constant is $\Lambda_0 = 1 - \Omega_0$, and $\sigma_8 = 0.9$. We write the Hubble
constant as $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. In addition, we always
use 'log' for the 10-based logarithm and 'ln' for the
natural logarithm.



2 COLOR-MAGNITUDE BIMODALITY AND 
THE CENTRE-SATELLITE SPLIT 

2.1 Bimodality in the SDSS 

Figure 2. Color-magnitude diagram in the $M_r < -19.5$ volume-limited
SDSS catalog. Solid lines show the mean values of the red
and blue sequences (equations 1 and 2); dashed line shows the
satellite sequence (equation 7), and dotted line shows equation (4),
which some authors use to divide the population into red and
blue.

Baldry et al. (2004) report that the distribution of rest-frame
$u - r$ color at fixed $r$-magnitude can be well-modeled as the
sum of two Gaussian components. The same is true of the
distribution of rest-frame $g - r$ color (e.g., Blanton et al.
2005); we call these the red and blue components of the
distribution $p(c|L)$. The mean and rms values of these components
depend on luminosity. This dependence is quite well
described by simple power laws:

$$\langle g-r|M_r\rangle_{\rm red} = 0.932 - 0.032\,(M_r + 20), \qquad {\rm rms}(g-r|M_r)_{\rm red} = 0.07 + 0.01\,(M_r + 20); \tag{1}$$

$$\langle g-r|M_r\rangle_{\rm blue} = 0.62 - 0.11\,(M_r + 20), \qquad {\rm rms}(g-r|M_r)_{\rm blue} = 0.12 + 0.02\,(M_r + 20). \tag{2}$$

The fraction of objects in the blue component decreases with
increasing luminosity:

$$f_{\rm blue}(M_r) \approx 0.46 + 0.07\,(M_r + 20), \tag{3}$$

and drops toward zero at the bright end.
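
The fits in equations (1) to (3) can be turned into a small numerical sketch of the bimodal color distribution at fixed magnitude. The function names below are illustrative and not from the paper, and clipping the blue fraction to the interval [0, 1] is an assumption made here so that the linear fit of equation (3) stays a valid fraction outside the fitted range.

```python
import numpy as np

def red_sequence(M_r):
    """Mean and rms of g-r along the red sequence (equation 1)."""
    x = np.asarray(M_r, dtype=float) + 20.0
    return 0.932 - 0.032 * x, 0.07 + 0.01 * x

def blue_sequence(M_r):
    """Mean and rms of g-r along the blue sequence (equation 2)."""
    x = np.asarray(M_r, dtype=float) + 20.0
    return 0.62 - 0.11 * x, 0.12 + 0.02 * x

def blue_fraction(M_r):
    """Fraction of objects in the blue component (equation 3).

    Clipping to [0, 1] is an assumption of this sketch.
    """
    x = np.asarray(M_r, dtype=float) + 20.0
    return np.clip(0.46 + 0.07 * x, 0.0, 1.0)

def gaussian(c, mean, rms):
    """Normalized Gaussian density in color c."""
    return np.exp(-0.5 * ((c - mean) / rms) ** 2) / (np.sqrt(2.0 * np.pi) * rms)

def color_pdf(c, M_r):
    """Two-Gaussian model of p(c|M_r): red plus blue component."""
    f_b = blue_fraction(M_r)
    mean_r, rms_r = red_sequence(M_r)
    mean_b, rms_b = blue_sequence(M_r)
    return (1.0 - f_b) * gaussian(c, mean_r, rms_r) + f_b * gaussian(c, mean_b, rms_b)
```

Evaluating `color_pdf` on a grid of colors at a few magnitudes gives curves of the kind described for Figure 1, under the assumptions stated above.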

Figure 1 shows this bimodality, and the two Gaussian
component fits which are based on these expressions. Our
model of the bimodality, which motivates an algorithm for
constructing mock catalogs, and which our halo model calculation
requires, uses the red and blue sequences given by
equations (1) and (2). These sequences are also shown in
a color-magnitude diagram, Figure 2, along with the color-magnitude
contours of one of the volume-limited SDSS catalogs
used in Section 4.

However, it is common to make a cruder approximation 
to this bimodality, by simply labeling galaxies as 'red' if they 
are redder than 

$$(g - r)_{\rm cut} = 0.8 - 0.03\,(M_r + 20), \tag{4}$$
and calling them 'blue' otherwise (e.g., Zehavi et al. 2005;



Blanton & Berlind 2007). (The recent analysis of satellite 
galaxy colors by van den Bosch et al. 2007b used a stellar 
mass-based split, which translates into a similar color cut as 
the one above, although their cut is slightly steeper with re- 
spect to r-band luminosity.) In what follows, we will only use 
this sharp threshold when comparing our results to previous 
work. 

The SDSS colors (and magnitudes) have measurement 
errors which contribute to the rms of the red and blue se- 
quences, especially at faint magnitudes. However, the uncer- 
tainties in the g — r galaxy colors in the SDSS are typically 
less than 0.02 mags, so they are unlikely to significantly af- 
fect the constraints on the model. Since the measurement 
errors almost certainly do not correlate with environment, 
they are not expected to bias the measured color mark correlation
functions shown in Section 4; they will, however,
increase the error bars on the clustering signal. We note, 
however, that there is an important systematic problem with 
the colors for which we do not correct: namely, a dusty spi- 
ral will appear redder if viewed edge-on rather than face-on. 
In fact, a significant fraction of the objects called 'red' are 
not the early-types which one typically associates with the 
'red sequence' (Bernardi et al. 2003). Mitchell et al. (2005)
estimate that this fraction is of order 40% (also see Maller
et al. 2008). Since this systematic also affects the luminosi- 
ties, for which no halo-model analysis to date has yet made 
a correction, we have not done so here either. 

2.2 Luminosities and colors of centrals and 
satellites 

To illustrate our approach we will begin with an extreme 
assumption. Suppose that: (i) the bimodal color distribution 
is independent of halo mass (by which we mean that the 
distribution of color at fixed luminosity is independent of 
halo mass; the distribution of luminosities, of course, does 
depend on halo mass), and that (ii) satellites are drawn from 
the red part of the bimodal color distribution - no satellites 
come from the blue sequence. Later in this paper, we will 
find it necessary to relax the second assumption, but the 
data does not yet require us to give up the first. We think 
assumption (ii) is a useful extreme which helps bring into 
focus the key points of the approach. 

Given the constraints from the color distribution as a 
function of luminosity (Section 2.1) and from luminosity-
dependent clustering (Appendix A), these two assumptions
allow one to model the halo mass dependence of the col- 
ors of both centrals and satellites, and in general to build 
a model of how galaxy clustering depends on color, without 
any additional free parameters. For example, these assump- 
tions imply that the mean satellite color is 

$$\langle c|m\rangle \equiv \int dc\, p(c|m)\, c = \int dL\, p(L|m) \int dc\, p(c|L,m)\, c = \int dL\, p(L|m)\, \langle c|L,m\rangle. \tag{5}$$

Whereas the first equality is the definition, the final expres- 
sion shows how one might estimate the left-hand side from 
a knowledge of the luminosity distribution in halos of mass 
m and the mean color at given luminosity in such halos. 

If the distribution of satellite colors at fixed satellite luminosity
is independent of halo mass (this is not unreasonable,
given that the distribution of luminosities themselves
is approximately independent of halo mass; see Skibba et al.
2006, 2007; Hansen et al. 2008), then this becomes

$$\langle c|m\rangle_{\rm sat} = \int dL\, p_{\rm sat}(L|m)\, \langle c|L\rangle_{\rm sat}. \tag{6}$$

Thus, given $m$, we integrate over the distribution of satellite
luminosities, weighting by $\langle c|L\rangle_{\rm sat}$.

Our simplest model (assumption ii) uses equation (1),
the color magnitude relation along the red sequence, for
$\langle c|L\rangle_{\rm sat}$. We will show later that setting

$$\langle g - r|M_r\rangle_{\rm sat} = 0.83 - 0.08\,(M_r + 20) \tag{7}$$

instead, which is bluer at faint luminosities (see Figure 2),
provides substantially better agreement with the observa- 
tions. This is best thought of as a model in which satellites 
are drawn from the red sequence with probability 

$$p({\rm red\ sat}|L) = \frac{\langle c|L\rangle_{\rm sat} - \langle c|L\rangle_{\rm blue}}{\langle c|L\rangle_{\rm red} - \langle c|L\rangle_{\rm blue}}, \tag{8}$$

and from the blue sequence with probability

$$p({\rm blue\ sat}|L) = 1 - p({\rm red\ sat}|L). \tag{9}$$

These expressions imply that, for SDSS $g - r$ colors,
$p({\rm blue\ sat}|L) \approx 0.4$ at $M_r = -18$, and it drops to zero at
$M_r \sim -22$. Since the fraction of galaxies that are satellites
has a similar dependence on luminosity (we provide explicit
HOD-derived expressions for this later), this model says that
although almost sixty percent of the galaxies with $M_r = -18$
are from the blue sequence (c.f. equation 3), slightly less
than twenty percent of the galaxies with $M_r = -18$ are blue
satellites: only a third of the faint blue galaxies are satellites,
the others are centrals. Allowing for blue-sequence satellites
modifies the discussion below trivially.

It is worth reiterating that, in this model, satellite colors
only depend on halo mass because satellite luminosities do.
Since $p_{\rm sat}(L|m)$ depends only weakly on $m$ (Skibba et al.
2007), we expect $\langle c|m\rangle_{\rm sat}$ to also depend only weakly on $m$.

In practice, we do not evaluate the integral in equation (6)
as written. Rather, we use a variation of the trick
we used in Skibba et al. (2006). Namely, for some function
$C(L)$ of $L$,

$$\int_{L_{\rm min}}^{\infty} dL\, C(L) \int_{L}^{\infty} dL'\, p(L'|m) = \int_{L_{\rm min}}^{\infty} dL'\, p(L'|m) \int_{L_{\rm min}}^{L'} dL\, C(L). \tag{10}$$

Skibba et al. studied the case where $C(L) = 1$, so the inner
integral gave $L' - L_{\rm min}$. Here, we wish to set $C(L)$ to be
that function of $L$ which, when integrated over $L$, yields
$\langle c|L\rangle_{\rm red}$. Thus,

$$\langle c|m\rangle_{\rm sat} = \langle c|L_{\rm min}\rangle_{\rm sat} + \int_{L_{\rm min}}^{\infty} dL\, C(L)\, P_{\rm sat}(>L|m), \tag{11}$$

where we have defined

$$P_{\rm sat}(>L|m) \equiv \int_{L}^{\infty} dL'\, p_{\rm sat}(L'|m) = \frac{N_{\rm sat}(>L|m)}{N_{\rm sat}(>L_{\rm min}|m)}. \tag{12}$$

If color and luminosity are in magnitudes (i.e., we work in
logarithmic rather than linear variables) then the integral is
simpler:

$$\langle g - r|m\rangle_{\rm sat} = \langle g - r|M_{\rm min}\rangle_{\rm sat} + C_{\rm sat,\,slope} \int_{M_{\rm min}}^{-\infty} dM_r\, P_{\rm sat}(<M_r|m), \tag{13}$$

where $P_{\rm sat}(<M_r|m) = P_{\rm sat}(>L|m)$, and $C_{\rm sat,\,slope}$ is the
slope of the relation showing how the mean satellite color
changes with magnitude. That is, $C_{\rm sat,\,slope} = -0.032$ or
$-0.08$ if satellites are drawn from the red sequence (c.f.
equation 1) or from equation (7).

Obtaining an expression for the typical color associated
with the central galaxies of halos of mass $m$ is more complicated.
Although the bimodal distribution of color at fixed luminosity
can be thought of as arising from a mix of objects which lie
along a blue or a red sequence, in what follows, it will be
more useful to think in terms of the central-satellite split.
In this case,

$$\langle c|L\rangle = \frac{N_{\rm cen}(L)\,\langle c|L\rangle_{\rm cen} + N_{\rm sat}(L)\,\langle c|L\rangle_{\rm sat}}{N_{\rm cen}(L) + N_{\rm sat}(L)}, \tag{14}$$

making

$$\langle c|L\rangle_{\rm cen} = \langle c|L\rangle + \frac{N_{\rm sat}(L)}{N_{\rm cen}(L)} \left[\langle c|L\rangle - \langle c|L\rangle_{\rm sat}\right]. \tag{15}$$

If, as we assumed for the satellites, the distribution of central
galaxy colors at fixed luminosity is independent of halo mass
(the results of Berlind et al. 2005 support this assumption),
then the mean color as a function of halo mass is simply
$\langle c|m\rangle_{\rm cen} = \langle c|L(m)\rangle_{\rm cen}$ if there is no scatter between central
galaxy luminosity and halo mass (e.g., Zehavi et al. 2005).
If there is scatter (e.g., Zheng et al. 2007), then

$$\langle c|m\rangle_{\rm cen} = \int dL\, p_{\rm cen}(L|m)\, \langle c|L\rangle_{\rm cen}. \tag{16}$$



Now, by hypothesis, $\langle c|L\rangle_{\rm sat}$ is given by equation (1)
(or equation 7), whereas $\langle c|L\rangle$ is simply the mean color of
all galaxies as a function of luminosity. Thus, both these
quantities are observables, or are constrained by observables,
for the satellites (Skibba 2008); the only unknown
is $N_{\rm sat}(L)/N_{\rm cen}(L)$. Since both numbers are counted in the
same volume, this is the same as the ratio of the number
densities: $n_{\rm sat}(L)/n_{\rm cen}(L)$. We discuss how this ratio is determined
by the luminosity-based HOD in Appendix A.
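
As a rough numerical illustration of the central-satellite bookkeeping, the sketch below solves the number-weighted average of central and satellite colors for the mean central color. It assumes the two-Gaussian fits of equations (1) to (3) for the mean color of all galaxies and equation (7) for the satellites. The satellite-to-central ratio is passed in as a plain number, because the HOD-based expression for it lives in the appendix and is not reproduced here; the value 0.3 in the usage lines is purely illustrative.

```python
import numpy as np

def mean_color_all(M_r):
    """Mean g-r of all galaxies at M_r: mixture mean of equations (1)-(3)."""
    x = np.asarray(M_r, dtype=float) + 20.0
    f_blue = np.clip(0.46 + 0.07 * x, 0.0, 1.0)  # clipping is our assumption
    return (1.0 - f_blue) * (0.932 - 0.032 * x) + f_blue * (0.62 - 0.11 * x)

def mean_color_sat(M_r):
    """Mean satellite g-r at M_r (equation 7)."""
    x = np.asarray(M_r, dtype=float) + 20.0
    return 0.83 - 0.08 * x

def mean_color_cen(M_r, nsat_over_ncen):
    """Mean central g-r obtained by solving the number-weighted average
    of centrals and satellites for the central term."""
    c_all = mean_color_all(M_r)
    return c_all + nsat_over_ncen * (c_all - mean_color_sat(M_r))

# Illustrative usage: the ratio 0.3 is a made-up number, not a value from the paper.
c_cen = mean_color_cen(-19.0, 0.3)
c_all = mean_color_all(-19.0)
```

With these inputs the central color comes out bluer than the mean for all galaxies at the same magnitude, because the satellites are redder than that mean.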

It is worth noting that the quantity in square brackets
in equation (15) is negative. This means that, in general, the
colors of central galaxies are bluer than the average for their
luminosities. Although this seems counter to intuition (one
is used to thinking of central galaxies as being red), it is, in
fact, sensible. Essentially, the paradox is resolved when one
realizes that the satellites actually inhabit more massive halos
than do centrals of the same luminosity. It may help to
note that this effect is most pronounced at low $L$, where the
mean color is significantly bluer than the red sequence, and
the number of satellites can be large. Low luminosity galaxies
that are centrals are hosted by low mass halos, whereas
satellite galaxies of similar luminosity are more likely to reside
in groups or clusters, so their parent halos are more
massive. Thus, our model has placed blue central galaxies
in low mass halos and red satellite galaxies in massive halos.
At higher luminosities, $\langle c|L\rangle$ approaches that of the red
sequence. In this limit, the term in square brackets becomes
small, as does the number of satellites, so the colors of central
galaxies tend to $\langle c|L\rangle$: that is, our model places luminous
central galaxies on the red sequence.

3 TWO WAYS TO TEST THE MODEL OF
BIMODALITY

We now describe two ways to test our model of the bimodal- 
ity. The first is numerical — we provide an algorithm for 
constructing mock catalogs which are consistent with our 
model. The model can be tested by performing the same 
analysis on the mocks that was performed on real data. 
This is particularly useful for analyses which are somewhat 
involved or contrived, so that an analytic description is dif- 
ficult. The second is analytic — we show how our model 
can be implemented to provide a halo model description of 
mark correlations when the mark is color. Skibba (2008) de- 
scribes the result of a third test: a direct measurement of 
central and satellite colors in group catalogs. 



3.1 An algorithm for constructing mock catalogs 
with luminosities and colors 

The analysis above shows that one can generate a mock 
galaxy catalog in two steps: first generate luminosities, and 
then use them to generate colors. Note that the method used 
for generating luminosities is not important: the luminosities
could have come from an HOD analysis, a CLF analysis, or 
they may be based on a SHAM. 

Our algorithm for generating luminosities comes from
Skibba et al. (2006). Briefly, we specify a minimum luminosity
$L_{\rm min}$ which is smaller than the minimum luminosity
we wish to study. We then select the subset of halos in
the simulation which have $m > m_{\rm min}(L_{\rm min})$. Each halo is
assigned a central galaxy with luminosity given by inverting
the relation between halo mass and luminosity (equation A2).
We specify the number of satellites the halo contains
by choosing an integer from a Poisson distribution
with mean $N_{\rm sat}(>L_{\rm min}|m)$. The luminosity of each satellite
galaxy is specified by generating a random number $u_0$
distributed uniformly between 0 and 1, and finding that $L$
for which $N_{\rm sat}(>L|m)/N_{\rm sat}(>L_{\rm min}|m) = u_0$. This ensures
that the satellites have the correct luminosity distribution.
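
The satellite step can be sketched as a Poisson draw followed by inverse-transform sampling on a tabulated cumulative count. The cumulative occupation `toy_nsat_cumulative` below is a hypothetical stand-in chosen only so the example runs; the form actually used comes from the luminosity-based HOD and is not reproduced here. The grid extent and resolution are likewise assumptions.

```python
import numpy as np

def toy_nsat_cumulative(L, m, L_min=1.0):
    """HYPOTHETICAL mean number of satellites brighter than L in a halo of
    mass m. A toy exponential form, not the HOD fit used in the paper."""
    return (m / 1e13) * np.exp(-(np.asarray(L, dtype=float) - L_min))

def draw_satellite_luminosities(m, L_min, nsat_cumulative, rng,
                                L_max_factor=50.0, n_grid=2048):
    """Poisson number of satellites, then luminosities by inverting
    N_sat(>L|m) / N_sat(>L_min|m) = u0 on a grid."""
    mean_n = float(nsat_cumulative(L_min, m))
    n_sat = int(rng.poisson(mean_n))
    L_grid = np.linspace(L_min, L_max_factor * L_min, n_grid)
    ratio = nsat_cumulative(L_grid, m) / mean_n   # falls from 1 toward 0
    u0 = rng.uniform(0.0, 1.0, size=n_sat)
    # np.interp needs increasing abscissae, so reverse both arrays.
    return np.interp(u0, ratio[::-1], L_grid[::-1])

rng = np.random.default_rng(0)
L_sat = draw_satellite_luminosities(1e16, 1.0, toy_nsat_cumulative, rng)
```

Any monotonically decreasing cumulative count can be passed in place of the toy function; the tabulate-and-interpolate step is a convenience of this sketch rather than a statement of how the original catalogs were built.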

We could assign colors to each of the satellites by drawing
a Gaussian random number with mean and rms given
by inserting the satellite luminosity in equation (1) for the
red sequence. However, as we show in Section 4, this results
in a correlation between color and environment that is too
strong compared to the data. Instead, we want the satellites
to have colors which are bluer than the red sequence at
faint luminosities, as specified by equation (7). To implement
this in our mock catalog, we draw a uniformly distributed
random number $0 < u_1 < 1$. The satellite is drawn from
the red sequence (a Gaussian with mean and rms given by
equation 1) if $u_1 < p({\rm red\ sat}|L)$, where $p({\rm red\ sat}|L)$ is given
by equation (8), and from the blue sequence (Gaussian with
mean and rms from equation 2) otherwise. Note that only
the luminosity matters for determining the color; the halo
mass plays no additional role.
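
A minimal sketch of this satellite color draw follows. The red-sequence probability is chosen so that the mixture of the red and blue sequences has the mean satellite color of equation (7), that is, p(red sat|L) is the satellite mean minus the blue mean, divided by the red mean minus the blue mean. Function names are illustrative, and clipping the probability to [0, 1] is an assumption of this sketch.

```python
import numpy as np

def p_red_sat(M_r):
    """Probability that a satellite of magnitude M_r lies on the red sequence."""
    x = np.asarray(M_r, dtype=float) + 20.0
    c_red = 0.932 - 0.032 * x    # equation (1), mean
    c_blue = 0.62 - 0.11 * x     # equation (2), mean
    c_sat = 0.83 - 0.08 * x      # equation (7)
    return np.clip((c_sat - c_blue) / (c_red - c_blue), 0.0, 1.0)

def draw_satellite_colors(M_r, rng):
    """Draw g-r colors for satellites with magnitudes M_r (array-like)."""
    M_r = np.asarray(M_r, dtype=float)
    x = M_r + 20.0
    u1 = rng.uniform(size=M_r.shape)
    is_red = u1 < p_red_sat(M_r)
    mean = np.where(is_red, 0.932 - 0.032 * x, 0.62 - 0.11 * x)
    rms = np.where(is_red, 0.07 + 0.01 * x, 0.12 + 0.02 * x)
    return rng.normal(mean, rms), is_red

rng = np.random.default_rng(1)
colors, is_red = draw_satellite_colors(np.full(200000, -19.0), rng)
```

At $M_r = -18$ this gives a blue-satellite probability of about 0.42, in line with the value of roughly 0.4 quoted in Section 2.2. The halo mass never enters, as the text requires.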

Figure 3. Bimodal distribution of $g - r$ color for central galaxies
(red histogram), satellite galaxies (blue dashed histogram),
and all galaxies (centrals+satellites; black dotted histogram) in
a mock catalog with $M_r < -20.5$. The distributions are shown
for four intervals in log halo mass, indicated in square brackets in
each panel.

The colors for central galaxies can also be drawn from
either the red or blue sequence. To determine which, we draw
another uniformly distributed random number $u_2$. If
$u_2 > f_{\rm blue}(L)/f_{\rm cen}(L)$, where $L$ is the central object's luminosity,
then the object is assigned to the red sequence, so we draw a
Gaussian with luminosity-dependent mean and rms given by
equation (1). Else, it is blue, and we use equation (2) instead.
Equations (3) and (A8) show that this assigns all central
galaxies fainter than $M_r \sim -18.5$ to the blue sequence.
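
The central-galaxy draw can be sketched in the same way. The central fraction `f_cen` is taken as an input here, since its HOD-based expression (equation A8) is in the appendix; the numerical values used in the usage lines are illustrative assumptions, not values from the paper. The rule implemented is the one stated in the text: red if the uniform deviate exceeds the blue fraction divided by the central fraction, blue otherwise.

```python
import numpy as np

def draw_central_colors(M_r, f_cen, rng):
    """Draw g-r colors for centrals with magnitudes M_r.

    f_cen is the fraction of galaxies at each M_r that are centrals
    (supplied by the caller; a scalar or an array matching M_r).
    """
    M_r = np.asarray(M_r, dtype=float)
    x = M_r + 20.0
    f_blue = np.clip(0.46 + 0.07 * x, 0.0, 1.0)      # equation (3), clipped
    p_blue_cen = np.clip(f_blue / np.asarray(f_cen, dtype=float), 0.0, 1.0)
    u2 = rng.uniform(size=M_r.shape)
    is_red = u2 > p_blue_cen
    mean = np.where(is_red, 0.932 - 0.032 * x, 0.62 - 0.11 * x)
    rms = np.where(is_red, 0.07 + 0.01 * x, 0.12 + 0.02 * x)
    return rng.normal(mean, rms), is_red

rng = np.random.default_rng(2)
# Illustrative central fractions (0.7 and 0.5 are made-up numbers).
col_bright, red_bright = draw_central_colors(np.full(100000, -21.0), 0.7, rng)
col_faint, red_faint = draw_central_colors(np.full(1000, -18.0), 0.5, rng)
```

When the blue fraction exceeds the central fraction, the ratio is clipped to one and every central is assigned to the blue sequence, which is the faint-end behaviour described above.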

Finally, we place the central galaxy at the center of its 
halo, and distribute the satellites around it so that they 
follow an NFW profile (see Scoccimarro & Sheth 2002 for
how this can be done efficiently) . The resulting mock galaxy 
catalog has been constructed to have the correct luminos- 
ity function as well as the correct luminosity dependence of 
the galaxy two-point correlation function. In addition, col- 
ors in this catalog are assigned in accordance with the model 
described previously: satellite and central galaxy colors are 
assigned such that the galaxy population as a whole has the 
correct color-luminosity distribution. 
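
One simple way to place satellites so that they trace an NFW profile is to invert the enclosed-mass fraction numerically and pick isotropic directions. This is a generic sketch and not necessarily the efficient method of Scoccimarro & Sheth (2002). The concentration, the truncation at the virial radius, and the grid resolution are assumptions of the example.

```python
import numpy as np

def nfw_mass_shape(x):
    """Enclosed-mass shape of an NFW profile: ln(1+x) - x/(1+x), x = r/r_s."""
    x = np.asarray(x, dtype=float)
    return np.log1p(x) - x / (1.0 + x)

def draw_nfw_positions(n_sat, r_vir, concentration, rng, n_grid=4096):
    """Satellite positions relative to the halo center, truncated at r_vir."""
    # Tabulate the enclosed-mass fraction out to x = concentration.
    x_grid = np.linspace(0.0, concentration, n_grid)
    cdf = nfw_mass_shape(x_grid) / nfw_mass_shape(concentration)
    u = rng.uniform(size=n_sat)
    r = np.interp(u, cdf, x_grid) * r_vir / concentration
    # Isotropic directions: uniform in cos(theta) and in phi.
    cos_theta = rng.uniform(-1.0, 1.0, size=n_sat)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n_sat)
    sin_theta = np.sqrt(1.0 - cos_theta ** 2)
    unit = np.column_stack([sin_theta * np.cos(phi),
                            sin_theta * np.sin(phi),
                            cos_theta])
    return r[:, None] * unit

rng = np.random.default_rng(3)
pos = draw_nfw_positions(200000, r_vir=1.0, concentration=5.0, rng=rng)
```

The central galaxy sits at the origin of these offsets; adding the halo center to each row gives positions in the simulation volume.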

Our model makes a prediction for how the bimodality
in color differs for central and satellite galaxies. In Figure 3
we show the color distribution as a function of halo
mass of central and satellite galaxies in a mock catalog with 
Mr < —20.5. We normalize the central and satellite galaxy 
distributions by the total number of galaxies in each bin; 
consequently, the lower mass halos are dominated by cen- 
tral galaxies, while satellites contribute most of the galax- 
ies in massive halos. First, note that the satellite distri- 
bution is almost the same in each panel: this is a conse- 
quence of our assumption that the distribution of satellite 
colors at fixed luminosity is independent of halo mass (i.e., 
$p_{\rm sat}(c|L,m) = p_{\rm sat}(c|L)$), and the fact that satellite luminosities
are approximately independent of mass as well. On
the other hand, the centrals have a more bimodal distribu- 
tion in low-mass halos, while in massive halos most of them 
are on the red sequence. Second, it is interesting that the 
blue and red modes of the central galaxy bimodal color dis- 
tribution are closer together than those of the satellite color 
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distribution, such that the blue bump of the centrals tends 
to peak at the minimum in the satellite distribution. 

3.2 Implicit assumptions, bells and whistles 

This halo-model based prescription for making mock cat- 
alogs uses three simplifying assumptions which are worth 
discussing explicitly. First, although we assume halos are 
spherical and smooth, the density run of satellites around 
halo centers is almost certainly neither. Generating triaxial 
distributions is straightforward once prescriptions for how 
the triaxiality depends on halo mass and how it correlates 
with environment are available. Once these are known, they 
can be incorporated into the analytic halo-model description 
(Smith, Watts & Sheth 2006). Similarly, parametrizations 
of halo substructure can also be incorporated into the de- 
scription (Sheth & Jain 2003). Of course, both these types 
of correlations can be included in the mock catalog directly 
from a simulation if one simply selects the appropriate num- 
ber of particles from the halo itself, rather than generating 
the profile shape synthetically. This is costly because now 
one needs the full particle distribution, rather than just the 
halo catalog, to generate the mock - but note that it is not 
a problem of principle. 

Second, note that the number of galaxies in a halo, the 
spatial distribution of galaxies within a halo, and the assign- 
ment of luminosities all depend only on halo mass. None 
of these depend on the surrounding large-scale structure. 
Therefore, the mock catalog includes only those environmen- 
tal effects which arise from the environmental dependence 
of halo abundances. This point was made by Skibba et al. 
(2006); it is also true of our prescription for including colors. 

Third, halos of the same mass will have had a variety 
of formation histories. Some will have assembled their mass 
and their galaxy populations more recently than others. Re- 
cent assembly means less time for dynamical friction, and, 
possibly, a younger stellar population. So, at fixed halo mass, 
one might expect to find a correlation between the age of a 
halo and the galaxy population within it. In particular, the 
number of galaxies in a halo, their luminosities and their 
colors may all be correlated with the formation history. Our 
halo model description (and associated mock catalog) ig- 
nores all such correlations. To see this clearly, note that we 
assign luminosities and colors to the galaxies in a halo with- 
out regard for the number of galaxies in it. Had we used a 
SHAM to assign luminosities, then some of the correlation between
formation history and the galaxy population will have
been included. If one is already carrying along the particle 
distribution from the simulation to construct the mock, then 
the next level of complication is to also include additional 
information about the merger history in the simulation, for 
use when making the mock. 

We also assign colors to satellite galaxies without ex- 
plicit consideration of the color of the central galaxy, and 
we make no effort to incorporate color gradients within a 
halo into our model. This is mainly because the two-point 
statistics we study in this paper, weighted or unweighted, 
are known to be not very sensitive to gradients (see Sheth 
et al. 2001, Scranton 2002, Sheth 2005 and Skibba 2008 for 
more discussion and simple prescriptions for incorporating 
color gradients.) 

These are all interesting problems for the future (and
they are almost certainly not independent problems!), but
the measurements described in the next section do not require
these refinements.

Figure 4. Luminosity (top) and $g - r$ color (bottom) mark correlation
functions in a real-space mock catalog in which $M_r < -20.5$.
Solid curves show the halo model predictions.



3.3 A halo model description of color mark 
correlations 

Mark correlations are an efficient way to quantify the cor- 
relation between the properties of galaxies and their envi- 
ronment (Sheth, Connolly & Skibba 2005). The two-point 
mark correlation function is simply 



$$M(r) \equiv \frac{1 + W(r)}{1 + \xi(r)}, \tag{17}$$

where $\xi(r)$ is the traditional two-point correlation function
and $W(r)$ is the same sum over galaxy pairs separated by
$r$, but now each member of the pair is weighted by the ratio
of its mark to the mean mark of all the galaxies in the
catalog (e.g., Stoyan & Stoyan 1994; Beisbart & Kerscher
2000). In effect, the denominator divides-out the contribu- 
tion to the weighted correlation function which comes from 
the spatial contribution of the points, leaving only the con- 
tribution from the fluctuations of the marks. 
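
Because the weighted and unweighted correlation functions share the same normalization by random pairs, the mark correlation can be estimated as the ratio of weighted to unweighted pair counts in each separation bin. The brute-force sketch below assumes this simple ratio estimator, a non-periodic volume, and a catalog small enough that all pairs fit in memory; the function name is illustrative.

```python
import numpy as np

def mark_correlation(positions, marks, r_edges):
    """Estimate M(r) as (weighted pair counts) / (pair counts) per bin.

    positions : (N, 3) array of coordinates
    marks     : (N,) array of marks (e.g., g-r color or luminosity)
    r_edges   : bin edges in separation
    """
    positions = np.asarray(positions, dtype=float)
    marks = np.asarray(marks, dtype=float)
    w = marks / marks.mean()                     # mark over mean mark
    i, j = np.triu_indices(len(positions), k=1)  # each pair counted once
    sep = np.linalg.norm(positions[i] - positions[j], axis=1)
    dd, _ = np.histogram(sep, bins=r_edges)
    ww, _ = np.histogram(sep, bins=r_edges, weights=w[i] * w[j])
    with np.errstate(invalid="ignore", divide="ignore"):
        return ww / dd                           # NaN where a bin is empty

rng = np.random.default_rng(4)
pts = rng.uniform(size=(400, 3))
r_edges = np.linspace(0.05, 0.5, 6)
M_const = mark_correlation(pts, np.full(400, 0.8), r_edges)
M_random = mark_correlation(pts, rng.normal(0.8, 0.1, size=400), r_edges)
```

Marks that are independent of position give values close to unity on all scales, so departures from unity measure how the mark correlates with environment.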

In models where a galaxy's properties correlate with en- 
vironment only because they correlate with host halo mass, 
but halo abundances correlate with environment, it is rela- 
tively straightforward to write down a halo model of mark 
correlations (Sheth 2005). Since our model of central and 
satellite colors is precisely of this form, we can build a halo 
model of color-mark correlations. Appendix B provides a 
detailed description of how this is done. In principle, comparison
of this prediction with measurements in the SDSS
dataset allows a test of our approach.

Before performing this test with data, Figure 4 shows
a comparison with measurements in the mock catalog described
in the previous section. The halo population is from
the VLS simulation (Yoshida et al. 2001), and the mock 
galaxies have Mr < —20.5. Luminosities and colors were 
assigned as described above: the top panel shows M(r) as 
a function of real-space separation when luminosity is the 
mark; the bottom panel has <; — r as the mark. Solid curves 
show the halo model prediction, computed by inserting the 
mass dependence of the mean marks for centrals and satel- 
lites into the mark correlation formalism of Appendix [B] 
The luminosity and color mark correlations are significantly 
above unity, which clearly shows that in denser environ- 
ments we expect the luminosities of galaxies to be brighter 
(top panel) and the colors to be redder (bottom panel). The 
mark correlations also clearly show the transition from the 
1-halo term to the 2-halo term at $r \sim$ Mpc$/h$, which is the
virial radius of the most massive halos at $z \sim 0$. The transition
is more pronounced than in the traditional unmarked
correlation function $\xi(r)$.

There is reasonably good agreement between the halo 
model calculation and the mocks for both the luminosity and 
color mark correlation functions; the unmarked correlation
functions $\xi(r)$ agree extremely well, so they are not shown.
Both panels in the figure show a similar but small discrepancy
at similar scales, approximately where the transition from the
1-halo to the 2-halo term occurs. Although statistically significant,
this discrepancy is small compared to the significance
with which the signal itself differs from unity: the halo model 
calculations are qualitatively, if not quantitatively, correct 
across a wide range in scales. The agreement between the 
model and the mocks is encouraging; it suggests that much 
of the environmental dependence of galaxy color arises from 
the environmental dependence of host halo mass. 



4 COMPARISON WITH SDSS 

In this section we compare color mark projected correla- 
tion functions predicted by the halo model to measure- 
ments in the SDSS (York et al. 2000). We use two volume- 
limited large-scale structure samples built from the NYU 
Value-Added Galaxy Catalog (Blanton et al. 2005b) from
SDSS DR4plus, which is a subset of SDSS Data Release 5
(Adelman-McCarthy et al. 2007). We $k$-correct the magnitudes
to $z = 0.1$ using the kcorrect v4_1 code of Blanton
& Roweis (2007); the magnitudes are also corrected for
passive evolution. Our fainter catalog has limits
$-23.5 < {}^{0.1}M_r < -19.5$, $0.017 < z < 0.082$; it consists of 78356
galaxies with mean density $\bar{n}_{\rm gal} = 0.01061\,(h^{-1}\,{\rm Mpc})^{-3}$. Our
brighter catalog has $-23.5 < M_r < -20.5$ and $0.019 <
z < 0.125$, and contains 73468 galaxies with mean density
$\bar{n}_{\rm gal} = 0.00280\,(h^{-1}\,{\rm Mpc})^{-3}$. These luminosity thresholds
approximately correspond to $M_r < M^* + 1$ and $M_r < M^*$,
where $M^*$ is the break in the Schechter function fit to the
$r$-band luminosity function (Blanton et al. 2003).

For the measured correlation functions and jack-knife
errors, which require random catalogs and jack-knife sub-catalogs,
we use the hierarchical pixel scheme SDSSPix,
which characterizes the survey geometry, including edges
and holes from missing fields and areas near bright stars.
This same scheme has been used for other clustering analyses
(Scranton et al. 2005; Hansen et al. 2007) and for lensing
analyses (Sheldon et al. 2007).

Figure 5. Distribution of Petrosian (blue histogram) and model
(red dashed histogram) $g - r$ colors in the $M_r < -19.5$
volume-limited catalog.

Figure 5 shows the distribution of $g - r$ colors in our
fainter ($M_r < -19.5$) catalog. The distributions of Petrosian
and model colors are similar, although the model colors are
slightly redder. The mean Petrosian color is 0.796, whereas
the mean model color is 0.825. This is not unexpected:
galaxies have color gradients, and model colors measure the
color on smaller scales. These mean values are 0.850 and
0.885 in the brighter catalog ($M_r < -20.5$). The blue fractions
of the Petrosian colors of the fainter and brighter catalogs
are, respectively, 44% and 37% using the fixed
color-magnitude cut (equation 4), and 47% and 43% using the
double-Gaussian model (equations 1-3), and they are ~6%
lower for the model colors.

We now present our color mark correlation functions. 
In practice, in order to obviate redshift-space calculations 
in the halo model and redshift distortions in the data, we 
use the projected two-point correlation function 



$$w_p(r_p) = \int d\pi\, \xi(r_p, \pi) = 2 \int_{r_p}^{\infty} dr\, \frac{r\,\xi(r)}{\sqrt{r^2 - r_p^2}}, \tag{18}$$



where $r = \sqrt{r_p^2 + \pi^2}$, $r_p$ and $\pi$ are the galaxy separations
perpendicular and parallel to the line of sight, and we integrate
up to line-of-sight separations of $\pi = 40\,{\rm Mpc}/h$. We
estimate $\xi(r_p, \pi)$ using the Landy & Szalay (1993) estimator



$$\xi(r_p, \pi) = \frac{DD - 2DR + RR}{RR}, \tag{19}$$



where $DD$, $DR$, and $RR$ are the normalized counts of data-data,
data-random, and random-random pairs in each separation
bin. We then define the marked projected correlation
function



$$M_p(r_p) = \frac{1 + W_p(r_p)/r_p}{1 + w_p(r_p)/r_p}, \tag{20}$$



which makes $M_p(r_p) \approx M(r)$ on scales larger than a few
Mpc. For the SDSS measurements, we used random catalogs 
with 10 times as many points as in the data; the error bars 
show the variance of the measurements of 30 jack-knife sub- 
catalogs. 
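The chain from pair counts to the marked projected statistic can be sketched in a few lines. This is only an illustration of equations (18) to (20) under simplifying assumptions: the pair counts are taken as given raw counts in a single bin, the line-of-sight integral is a plain sum over equal-width bins with $\pi > 0$ (doubled by symmetry), and all function names are invented for this sketch.

```python
def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy & Szalay estimator from raw pair counts in one separation bin."""
    # Normalize each count by the total number of pairs of that type.
    dd_n = dd / (n_data * (n_data - 1) / 2.0)
    dr_n = dr / (n_data * n_rand)
    rr_n = rr / (n_rand * (n_rand - 1) / 2.0)
    return (dd_n - 2.0 * dr_n + rr_n) / rr_n


def projected(xi_of_pi, d_pi):
    """Sum xi(r_p, pi) over line-of-sight bins with pi > 0 and double it."""
    return 2.0 * sum(xi_of_pi) * d_pi


def marked_projected(big_w_p, w_p, r_p):
    """Marked projected correlation function, (1 + W_p/r_p) / (1 + w_p/r_p)."""
    return (1.0 + big_w_p / r_p) / (1.0 + w_p / r_p)


# Unclustered toy counts: every normalized count equals 0.01, so xi = 0.
xi_random = landy_szalay(49.5, 1000.0, 4995.0, 100, 1000)
# Twice as many data-data pairs as expected at random gives xi = 1.
xi_clustered = landy_szalay(99.0, 1000.0, 4995.0, 100, 1000)
```

The weighted statistic $W_p$ would be obtained in the same way, with each data galaxy's contribution to the pair counts multiplied by its mark divided by the mean mark.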

Figures 6 and 7 compare the color marked correlation
functions for the $M_r < -19.5$ and $M_r < -20.5$ catalogs
with our predictions. The solid and open points show the
measurements for Petrosian and model colors. The color
mark signals in the bottom panels are stronger for Petrosian
colors, at the $1\sigma$ level, for both luminosity thresholds
$M_r < -19.5$ and $M_r < -20.5$. Evidently, the environmental
dependence of Petrosian colors is stronger than that of
model colors. However, this is probably due to the fact that 
the red and blue peaks are slightly more displaced from one 
another for Petrosian rather than model colors. 

The correlation function of galaxies split by color is the 
measurement that has traditionally been used to show the 
environmental dependence of color (e.g., Zehavi et al. 2005;
Tinker et al. 2007). The top panel in Figure 7 shows such
measurements for galaxies redder and bluer than the color
cut given by equation 4. Open squares and triangles are
for measurements in the SDSS and in a mock catalog con- 
structed as described in the previous section. (The SDSS 
galaxies were split by their Petrosian colors; the measure- 
ment is virtually the same when they are split by model 
colors.) 

The mock catalog is at $z = 0$, whereas the SDSS measurements
(and corresponding theory curves) are at $z \sim 0.1$.
Therefore, to compare the clustering of red and blue galaxies
in our mock with the measurements, we measure the ratio of
$w_{p,{\rm red}}$ to $w_{p,{\rm all}}$, and $w_{p,{\rm blue}}$ to $w_{p,{\rm all}}$ in our $z = 0$ mock. We
then assume that this ratio would be the same at $z = 0.1$
as it is at $z = 0$; the triangles show the result of applying
this ratio to $w_{p,{\rm all}}$ at $z = 0.1$ (i.e., the filled circles), so they
represent how the clustering of red and blue galaxies differs
from the full sample in our mock. The agreement between
the clustering of the mock galaxies and SDSS galaxies is very 
good, indicating that our model reproduces these traditional 
measurements of color dependent clustering very well. 

Because mark statistics do not require binning of the 
dataset into coarse bins in color, or coarse bins in density, 
the mark correlation functions shown in the lower panels 
of the figures contain significantly more information about 
environmental correlations than more traditional measures. 
They allow the mark to take a continuous range in values, 
and they yield a clear, quantitative estimate of the corre- 
lation between the mark and the environment at a given 
scale. Mark statistics are also sensitive to the distribution of 
the marks: for example, for the fainter luminosity threshold 
($M_r < -19.5$), the color marks have a wider distribution
than for the brighter threshold, so some galaxies have colors
farther from the mean mark. Because these outliers also tend
to be in more extreme environments, the result is a stronger
mark correlation (e.g., $M_{g-r}(r_p = 100\,h^{-1}\,{\rm kpc}) \approx 1.23$
vs. 1.15). Notice also that the mark correlation functions
are more curved for the fainter luminosity threshold, with
a more distinct transition between the 1-halo and 2-halo
terms. This is because there are more satellite galaxies in
the fainter sample, and the mark clustering is more sensitive
to their spatial distribution within halos.

Figure 6. Projected two-point correlation function and $g - r$
color mark correlation function for $M_r < -19.5$. Points show
SDSS measurements for Petrosian (solid points) and model colors
(open points), with jack-knife errors. Solid curves show the
halo-model prediction when satellite galaxies can be drawn from either
the red or the blue sequences (equations 7-9); dashed curve shows
the prediction if satellites are drawn from the red sequence only
(equation 1).

Figure 7. Projected two-point correlation function and $g - r$ color
mark correlation function, like Figure 6, but for $M_r < -20.5$.
In the upper panel, the correlation functions for galaxies redder
and bluer than the color cut (equation 4) are also shown, for the
SDSS galaxies (open squares) and mock catalog galaxies (open
triangles). For clarity, error bars are only shown for the full SDSS
catalog.

Note that the model in which no satellites come from 
the blue sequence (dashed curve) produces too strong a sig- 
nal: galaxies in dense environments are too red. Since most 
of these are satellites, this model places too many red satel- 
lites in massive halos. The difference between the two models 
is greater for the faint luminosity threshold simply because 
there are more faint satellites, more of whose colors should 
be drawn from the blue sequence. The model in which satellites
come from a mix of the two sequences, though they are
increasingly red at large luminosities (equations 7-9), is in
good agreement with the measurements on all scales where 
the statistic is reliably measured. This suggests that this 
model of the colors of central and satellite galaxies is a rea- 
sonable one. The good agreement between our model and the 
data also indicates that the correlation between halo mass 
and environment is the primary driver of the environmental 
dependence of galaxy color. 



5 DISCUSSION 

We have developed and tested a simple model for several ob- 
served correlations between color and environment on scales 
of $100\,h^{-1}$ kpc $< r_p < 30\,h^{-1}$ Mpc. Our model is built upon
the model of luminosity mark clustering of Skibba et al. 
(2006), in which the luminosity-dependent halo occupation 
distribution was constrained by the observed luminosity- 
dependent correlation functions and galaxy number den- 
sities in the SDSS. The model presented here has added 
constraints from the bimodal distribution of the colors of 
SDSS galaxies as a function of luminosity. We make two 
assumptions: (i) that the bimodality of the color distribu- 
tion at fixed luminosity is independent of halo mass, and (ii) 
that satellite galaxies tend to follow a particular sequence in 
the color-magnitude diagram, one that approaches the red 
sequence with increasing luminosity (equation 7). Alternatively,
this assumption can be phrased as specifying how the
fraction of satellites which are drawn from the red and blue
sequences depends on luminosity (equation 9).

One virtue of our model is the ease with which it allows 
one to include color information into mock catalogs. Adding 
colors to a code which successfully reproduces luminosity 
dependent clustering requires just four simple lines of code
(two for centrals and two for satellites; Section 3.1). This
is far more efficient than 'brute-force' approaches which are 
based on fitting HODs to fine bins in L and color, or others 
which are based on using observed correlations between color 
and local density. Since bimodality is also observed at z = 1, 
it would be interesting to see if our approach is similarly 
successful at interpreting the measurements of Coil et al. 
(2008) in the DEEP2 sample. 
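To give a sense of what such a color-assignment step might look like, here is a minimal sketch. It is not the authors' code: every coefficient below is an illustrative placeholder rather than a fitted value, and the two red-fraction functions are hypothetical stand-ins. In the model itself the satellite red fraction follows from the assumed satellite sequence, and the central red fraction is then fixed by requiring that centrals plus satellites reproduce the observed bimodal distribution at each luminosity.

```python
import random


def red_sequence(mr):
    """Placeholder (mean, rms) of g-r on the red sequence at magnitude mr."""
    return 0.93 - 0.03 * (mr + 20.0), 0.07


def blue_sequence(mr):
    """Placeholder (mean, rms) of g-r on the blue sequence at magnitude mr."""
    return 0.62 - 0.11 * (mr + 20.0), 0.12


def clip01(x):
    return min(1.0, max(0.0, x))


def satellite_red_fraction(mr):
    # Hypothetical form: approaches 1 as galaxies get brighter (mr more negative).
    return clip01(0.5 - 0.2 * (mr + 20.0))


def central_red_fraction(mr):
    # Hypothetical stand-in; see the note above on how the model fixes this.
    return clip01(0.4 - 0.15 * (mr + 20.0))


def draw_color(mr, red_fraction, rng):
    """Pick a sequence with probability red_fraction, then draw a Gaussian color."""
    is_red = rng.random() < red_fraction
    mean, rms = red_sequence(mr) if is_red else blue_sequence(mr)
    return rng.gauss(mean, rms)


def assign_colors(galaxies, rng):
    """galaxies is a list of (mr, is_central) pairs; returns one color per galaxy."""
    colors = []
    for mr, is_central in galaxies:
        f_red = central_red_fraction(mr) if is_central else satellite_red_fraction(mr)
        colors.append(draw_color(mr, f_red, rng))
    return colors


rng = random.Random(0)
mock_colors = assign_colors([(-20.5, True), (-19.8, False)], rng)
```

Because the color depends only on luminosity and on the central or satellite label, no halo mass enters the draw, which is how assumption (i) of the model shows up in practice.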

Realistic colors are necessary for providing realistic 
training sets for galaxy group- and cluster-finding algo- 
rithms, and a number of groups are currently developing 
such mock catalogs. So we think it is worth emphasizing 
that our approach can be applied to any mock catalog which 
produces the correct luminosity dependence of clustering.
Thus, although we phrased our discussion in terms of an
HOD-based mock, mocks based on CLFs or SHAMs could
also use our method for generating colors.

In particular, cluster-finding algorithms that exploit information
about brightest cluster galaxies (BCGs), or galaxies'
positions from the red sequence, or galaxies' redshift-distorted
positions, or the multiplicity function or total luminosity
or stellar mass of groups, could all be tested with
mock catalogs constructed with the approach described in 
this paper. We will be happy to provide our mock catalogs 
to those interested, upon request. 

More generally, we feel that the simplicity of our ap- 
proach makes it an attractive way to begin to include the 
entire SED into the halo model description, and hence 
into mock catalogs. Specifically, starting from our successful 
model for adding $g - r$ given $L$, the next step might be to
add, say, $u - r$, given $g - r$ and $L$, again assuming that the
distribution $p(u - r \,|\, g - r, L)$ is independent of halo mass.
This is also attractive because we have shown that such an
approach is easily described using the language of the halo
model: Section 3.3 provides a halo-model description of
the color-mark correlation function. This facilitates the use 
of mark statistics in testing our hypothesis that the bimodal 
color distribution is independent of halo mass. 

Comparison of our mark correlation measurements with
measurements in our mock catalogs and with our halo model
calculations (Figures 6 and 7) suggests that if the bimodal
color distribution is independent of halo mass, then at least
some of the noncentral/satellite galaxies in a halo must be
drawn from the blue sequence, and this fraction of blue satellites
must be larger at low luminosities. This is one of the
key results of our paper. 

If satellites lie on the red sequence because their star 
formation has been quenched by processes such as 'strangulation'
(e.g., Weinmann et al. 2006), then our results suggest
that quenching is still on-going at lower luminosities. Such 
processes are expected to modify the colors and star forma- 
tion rates of satellite galaxies, but not their morphologies; 
we investigate this further in a subsequent paper by mea- 
suring morphology mark correlations in the SDSS Galaxy 
Zoo catalog. We caution, however, that we, like all previous 
halo model analyses, have ignored the fact that inclination
can affect the observed galaxy properties (luminosities and
colors in the present context). Corrections for inclination-related
effects are available in the literature (Giovanelli et
al. 1995; Tully et al. 1998; Sheth et al. 2003), and they are
not negligible. Recent work on this by Maller et al. (2008),
which appeared while our work was being refereed, provides 
relatively straightforward corrections which may be reason- 
ably accurate. For this reason, our work should be viewed 
as attempting a halo-model description of the observed col- 
ors, rather than providing a truly physical picture of the 
intrinsic (face-on?) colors. Of course, if the luminosities and 
colors had been corrected for inclination effects, we expect 
our analysis to also yield results which are closer to the true 
physical picture. But because we have not yet included these 
corrections, we believe that statements about the physics of 
'strangulation', especially at low luminosities, are prema- 
ture. 

We expect our model to be in good agreement with the 
findings of Zehavi et al. (2005), who analyzed volume-limited 
SDSS samples after dividing galaxies into two bins in color. 
They used a slightly redder color cut than did we to produce
the measurements shown in the top panel of Figure 7.
They found that the fraction of central galaxies which lay 
blueward of this cut increased as L decreased; that there 
were no faint blue satellites; and that, although there are 
blue satellites at intermediate and high L, they were about 
a factor of five less common than red satellites in halos of the 
same mass. Our model is in qualitative agreement, with the 
mean central and satellite galaxy colors increasing with both 
luminosity and halo mass. Zehavi et al. inferred from their 
results that the majority of bright galaxies are red centrals of 
massive halos, and that faint red galaxies are predominantly 
satellites in massive halos. This is consistent with Swanson 
et al. (2007), who found that both luminous and faint red 
galaxies are more strongly clustered than moderately bright 
red galaxies. We reach a similar conclusion, although not all 
faint red galaxies are satellites in massive systems: some are 
centrals in underdense environments. 

We also expect our model to be in qualitative agree- 
ment with the findings of Blanton & Berlind (2007). These 
authors defined blue galaxies as those lying blueward of 
$g - r = 0.8 - 0.03\,(M_r + 20)$ (our equation 4). They then
found that the color magnitude relation for galaxies in luminous
groups tended to have $f_{\rm blue}$ decreasing with group
luminosity, but that the red and blue sequences were otherwise
approximately independent of group luminosity. They
phrased their findings as showing that the color magnitude
relation depends on group luminosity, presumably because
they wished to draw attention to the dependence of $f_{\rm blue}$
on group luminosity. In light of the discussion above, we 
think this is slightly misleading. The red and blue sequences 
in our model are independent of group properties by con- 
struction. In our model, the decrease of the blue fraction 
in luminous groups is simply a consequence of the assump- 
tion that satellites tend to be drawn from the red rather 
than the blue sequence. This happens because more lumi- 
nous groups will tend to have more satellites and redder 
centrals (because central galaxy luminosity increases with 
halo mass which is, in turn, strongly correlated with total 
luminosity, and luminous galaxies are red). Since our model 
has mainly red satellites, the red fraction is larger in more 
luminous groups. Skibba (2008) describes the results of a 
direct comparison of our model predictions with the colors 
of centrals and satellites in group catalogs. 

In our model, all environmental correlations arise from 
the fact that massive halos tend to reside in denser envi- 
ronments (Mo & White 1996; Sheth & Tormen 2002). Re- 
cent studies of the environmental dependence of halo as- 
sembly have shown that halo properties such as formation 
time and concentration are correlated with the environment 
at fixed halo mass (Sheth & Tormen 2004; Gao, Springel &
White 2005; Wechsler et al. 2006; Croton, Gao & White 2007;
Wetzel et al. 2007; Keselman & Nusser 2007; Zu et al. 2007).
They have found that at fixed mass, halos in dense environ- 
ments form at slightly earlier times than halos in less dense 
environments. The success of our model suggests that such 
'assembly bias' effects are not the primary drivers of the en- 
vironmental dependence of galaxy colors in the real universe, 
thus extending previous conclusions about the insignificance 
of assembly bias on galaxy luminosities (Skibba et al. 2006; 
Abbas & Sheth 2006, 2007; Blanton & Berlind 2007; Tin- 
ker et al. 2007), at least for the relatively bright galaxies 
in the SDSS. Further tests, such as analyses of luminosity
and color mark statistics of catalogs constructed from
semi-analytic models with known assembly bias, would shed more
light on these issues, and are the subject of a subsequent paper.

Our model does not include the galactic 'conformity' 
reported by Weinmann et al. (2006), in which bluer centrals 
are likely to be surrounded by bluer satellites, at fixed halo 
mass. Including this effect is the subject of work in progress. 
The main quantitative predictions of our model, such as the 
mean central and satellite colors as a function of mass, and 
the correlations between color and environment, are not ex- 
pected to be significantly affected by this phenomenon, how- 
ever. Our model also does not include color gradients within 
halos — it has long been known that satellite galaxies near 
halo centers tend to be redder than in the outskirts. In this 
case, satellite color marks depend on both the host halo mass 
and on their distance from the halo center. Halo model anal- 
yses show that this should only matter on small scales (see 
discussion of Fig. 4 in Sheth et al. 2001; Scranton 2002); 
for galaxy populations with many satellite galaxies, the 1- 
halo term of the color mark signal is expected to be slightly 
higher (Sheth 2005). Skibba (2008) incorporates this effect, 
and does find such an increase at small scales. 

Finally, it is worth emphasizing that mark statistics are 
sensitive indicators of the correlations between galaxy prop- 
erties and the environment, and as such are powerful tools 
for constraining galaxy formation models. An analysis of 
marked correlation with star formation rate marks in the 
SDSS and the Millennium Simulation is the subject of work 
in progress. The halo-model description of marked statistics, 
based on the luminosity dependence of galaxy clustering, 
also has many applications. In a forthcoming paper (Skibba 
& Sheth 2008), we present a model of stellar mass mark
correlations and analyze them with SDSS measurements
analogous to the color mark correlations presented here.

APPENDIX A: EXPLICIT EXAMPLES OF
DIFFERENT HODS

The main text outlines our model; actual implementation of 
it depends on the form of the luminosity-based HOD. These 
are of two types - either the relation between halo mass and 
central galaxy luminosity is monotonic and deterministic, or 
there is some scatter. We use the parametrization of Zehavi 
et al. (2005) to illustrate the former case, and that of Zheng 
et al. (2007) to illustrate the latter. The results described in 
the main text are not particularly sensitive to this choice, 
although the plots we show are based on HODs in which there is
scatter.

A1 No scatter between $L_{\rm cen}$ and halo mass

To evaluate $n_{\rm sat}/n_{\rm cen}$, suppose that the relation between
halo mass and central luminosity is deterministic (i.e., there
is no scatter around the $L_{\rm cen} - m$ relation). Then

$$n_{\rm cen}(L) = \frac{dm_L}{dL} \left(\frac{dn}{dm}\right)_{m_L}, \tag{A1}$$

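where halo model interpretations of SDSS galaxy clustering
suggest that

$$\frac{m_L}{10^{12}\,h^{-1} M_\odot} \approx \exp\left(\frac{L/1.12}{10^{10}\,h^{-2} L_\odot}\right) - 1 \tag{A2}$$

(Zehavi et al. 2005; Skibba et al. 2006), and the halo mass
function $dn/dm$ is described in Sheth & Tormen (1999).

The number density of satellite galaxies which have luminosity
$L$, whatever the mass of the parent halo, is given
by differentiating

$$n_{\rm sat}(>L) = \int dm\, \frac{dn}{dm}\, N_{\rm sat}(>L|m) \tag{A3}$$

with respect to $L$. Zehavi et al. (2005) show that

$$N_{\rm sat}(>L|m) = \left(\frac{m}{23\,m_L}\right)^{\alpha_L}, \tag{A4}$$

where $m_L$ is given by the expression above, and

$$\alpha_L \approx 1.16 - 0.1\,(M_r + 20) + 0.1\,e^{\cdots} \tag{A5}$$

is a weakly increasing function of $L$. Thus,

$$n_{\rm sat}(>L) = \int_{m_L}^{\infty} dm\, \frac{dn}{dm} \left(\frac{m}{23\,m_L}\right)^{\alpha_L}, \tag{A6}$$

so

$$\frac{n_{\rm sat}(L)}{n_{\rm cen}(L)} = \frac{1}{23^{\alpha_L}} + \alpha_L\, \frac{n_{\rm sat}(>L)}{(dn/d\ln m)_{m_L}} - \frac{d\ln\alpha_L}{d\ln L}\, \frac{d\ln L/d\ln m_L}{(dn/d\ln m)_{m_L}} \int_{m_L}^{\infty} dm\, \frac{dn}{dm} \left(\frac{m}{23\,m_L}\right)^{\alpha_L} \ln\left(\frac{m}{23\,m_L}\right)^{\alpha_L}, \tag{A7}$$

and the fraction of objects which are centrals is

$$f_{\rm cen}(L) = \frac{n_{\rm cen}(L)}{n_{\rm cen}(L) + n_{\rm sat}(L)} = \frac{1}{1 + n_{\rm sat}(L)/n_{\rm cen}(L)}. \tag{A8}$$

To see what these expressions imply, suppose that $\alpha_L$ were
independent of $L$. Then

$$\frac{n_{\rm sat}(L)}{n_{\rm cen}(L)} = \alpha\, \frac{n_{\rm sat}(>L)}{(dn/d\ln m)_{m_L}} + \frac{1}{23^{\alpha}} \tag{A9}$$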



and 

dLC{L) [-^^j ■ (AlO) 

In this case, the mean satellite color is independent
of halo mass. If $\alpha = 1$ (not far off from its actual
value) and $m\,(dn/dm) \propto \exp(-m/m_*)/m_*$, for some
fiducial value of $m_*$ (halos more massive than $m_*$
are indeed exponentially rare), then
$n_{\rm sat}(>L) = \exp(-m_L/m_*)/(23\,m_L)$, making
$n_{\rm sat}(L)/n_{\rm cen}(L) = (m_*/m_L + 1)/23$.
This ratio decreases as $m_L$ increases:
as $L$ increases, the ratio of satellites to centrals decreases,
and the fraction of centrals increases.
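The steps behind this limiting case can be written out explicitly. The derivation below assumes $\alpha = 1$, a lower integration limit of $m_L$, and a mass function of the form $dn/dm = A\,e^{-m/m_*}/(m\,m_*)$; the normalization $A$ is introduced here only for bookkeeping and cancels in the final ratio.

```latex
\begin{aligned}
n_{\rm sat}(>L) &= \int_{m_L}^{\infty} dm\, \frac{A\,e^{-m/m_*}}{m\,m_*}\, \frac{m}{23\,m_L}
  = \frac{A\,e^{-m_L/m_*}}{23\,m_L}, \\
\left(\frac{dn}{d\ln m}\right)_{m_L} &= \frac{A\,e^{-m_L/m_*}}{m_*}, \\
\frac{n_{\rm sat}(L)}{n_{\rm cen}(L)} &= \frac{n_{\rm sat}(>L)}{(dn/d\ln m)_{m_L}} + \frac{1}{23}
  = \frac{m_*/m_L + 1}{23}.
\end{aligned}
```

The last line uses equation (A9) with $\alpha = 1$; the term $1/23$ comes from the $L$-dependence of the lower limit of the integral.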



In the analyses which follow, we use the actual halo
model values of these quantities rather than these approximations.
A reasonable fit to the actual halo model values is
given by

$$\frac{n_{\rm sat}(L)}{n_{\rm cen}(L)} \approx 0.35\,\left[2 - {\rm erfc}\left[0.6\,(M_r + 20.5)\right]\right]. \tag{A11}$$

This ratio tends to 0.7 at small luminosities, making the
fraction of galaxies which are centrals at $L \ll 10^{10} L_\odot$
about 3/5 (cf. equation A8), consistent with the satellite
fraction $f_{\rm sat}(L)$ of van den Bosch et al. (2007a).
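The quoted limits can be checked numerically. The sketch below evaluates the fitting formula (A11), reading its argument as $0.6\,(M_r + 20.5)$, and converts it to a central fraction via equation (A8); the function names are chosen here for illustration.

```python
import math


def sat_to_cen_ratio(mr):
    """Fitting formula (A11): n_sat(L)/n_cen(L) as a function of M_r."""
    return 0.35 * (2.0 - math.erfc(0.6 * (mr + 20.5)))


def central_fraction(mr):
    """Equation (A8): f_cen = 1 / (1 + n_sat/n_cen)."""
    return 1.0 / (1.0 + sat_to_cen_ratio(mr))


# Faint limit: erfc -> 0, so the ratio tends to 0.7 and f_cen to 1/1.7.
faint_ratio = sat_to_cen_ratio(-16.0)
faint_fcen = central_fraction(-16.0)
# Bright limit: erfc -> 2, so the ratio tends to 0 and f_cen to 1.
bright_fcen = central_fraction(-23.0)
```

At $M_r = -20.5$ the complementary error function equals 1 and the ratio is exactly 0.35, halfway between the two limits.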



A2 Stochasticity in the $L_{\rm cen} - m$ relation



Zheng et al. (2007) allow for stochasticity in the relation be- 
tween halo mass and central galaxy luminosity. They assume 
that 



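$$P(\log L_{\rm cen}|M) = \frac{1}{\sqrt{2\pi}\,\sigma_{\log L}} \exp\left[-\frac{\left[\log\left(L_{\rm cen}/\langle L_{\rm cen}|M\rangle\right)\right]^2}{2\,\sigma_{\log L}^2}\right], \tag{A12}$$

and then set

$$\langle N_{\rm cen}|M\rangle = \frac{1}{2}\left[1 + {\rm erf}\left(\frac{\log(M/M_{\rm min})}{\sigma_{\log M}}\right)\right] \tag{A13}$$

and

$$\langle N_{\rm sat}|M\rangle = \left(\frac{M - M_0}{M_1}\right)^{\alpha}. \tag{A14}$$

The Poisson model for satellite counts sets

$$\langle N_{\rm sat}(N_{\rm sat} - 1)|M\rangle = \langle N_{\rm sat}|M\rangle^2. \tag{A15}$$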


Their Table 1 shows how all of the parameters in this HOD 
vary with SDSS r-band luminosity. We have found that these 
scalings with Lr are well approximated by 



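$$\frac{M_{\rm min}}{10^{11.95}\,h^{-1} M_\odot} \approx \exp\left(\frac{L}{10^{10.0}\,h^{-2} L_\odot}\right) - 1, \tag{A16}$$

$$\sigma_{\log M} \approx \begin{cases} 0.26, & {\rm if}\ M_r > -20.5, \\ 0.385 - 0.25\,(M_r + 21), & {\rm otherwise}, \end{cases} \tag{A17}$$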



Mo 



17Mn,in 

L 



lO"-75M0//i 
a 



109-9Lq//i2 

1 - 0.07 {Mr + IS.i 



(A18) 
(A19) 
(A20) 



As in Zehavi et al. (2005), the value of $M_1/M_{\rm min}$, which
determines the critical mass above which halos typically host
at least one satellite galaxy, is approximately independent
of luminosity, while the $\langle N_{\rm sat}\rangle$ slope $\alpha$, which characterizes
the mass dependence of the efficiency of galaxy formation,
increases with luminosity. The two new HOD parameters
are $\sigma_{\log M}$ and $M_0$. They are not constrained well and their
uncertainties are large (see Zheng et al. for details), but our 
correlation functions and color mark correlation functions 
are not very sensitive to their exact values. 

For the two luminosity thresholds discussed in the
main text, $M_r < -19.5$ and $M_r < -20.5$, the parameters
above are $M_{\rm min} \approx 5.8 \times 10^{11}\,h^{-1} M_\odot$ and
$M_{\rm min} \approx 2.2 \times 10^{12}\,h^{-1} M_\odot$, and the effective value of
$M_1/M_{\rm min} \approx 20$, approximately independent of luminosity, is
similar to the factor of 23 in the Zehavi et al. HOD.

For our purposes, the main difference with this HOD 
model is the scatter in luminosity at fixed mass. We first 
discuss how to construct a mock catalog that includes this 
scatter. We then explain how our model of the color mark 
is modified. 

To account for the scatter between $L_{\rm cen}$ and $M_{\rm halo}$ in the
mock catalogs, we do not simply select the subset of halos
in the simulation which have $M > M_{\rm min}(L_{\rm min})$, as we do in
the case of a sharp threshold (e.g., Section 3.1). Instead, we
generate uniformly distributed random numbers $u$ between
0 and 1 for each halo of mass $M$. Then we keep the halo if
$u < \langle N_{\rm cen}|M\rangle$ (equation A13). As a result, only half of the
halos with $M \approx M_{\rm min}$ are kept, as are quite a few halos with
$M < M_{\rm min}$. Larger values of $\sigma_{\log M}$ increase the range of halo
masses around $M_{\rm min}$ and increase the total number of halos
because the abundance of halos increases with decreasing
mass.
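The selection step described above can be sketched as follows. The form of the mean central occupation assumed here is the error-function threshold of Zheng et al. (2007), $\langle N_{\rm cen}|M\rangle = \frac{1}{2}[1 + {\rm erf}(\log(M/M_{\rm min})/\sigma_{\log M})]$; the function names and the toy halo list are inventions of this sketch.

```python
import math
import random


def mean_ncen(mass, m_min, sigma_logm):
    """Mean number of central galaxies in a halo of the given mass."""
    return 0.5 * (1.0 + math.erf(math.log10(mass / m_min) / sigma_logm))


def select_central_halos(halo_masses, m_min, sigma_logm, rng):
    """Keep each halo with probability <N_cen|M> by comparing to a uniform draw."""
    return [m for m in halo_masses
            if rng.random() < mean_ncen(m, m_min, sigma_logm)]


# Toy check: halos exactly at the threshold mass are kept about half the time.
rng = random.Random(1)
kept = select_central_halos([1.0e12] * 10000, 1.0e12, 0.26, rng)
```

With $\sigma_{\log M} \to 0$ the error function becomes a step and the procedure reduces to the sharp-threshold selection $M > M_{\rm min}$.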

Our halo model of the color mark is also modified
by the scatter between luminosity and mass, because
$\langle N_{\rm cen}|M, L_{\rm min}\rangle$ is no longer a step function. The central
galaxy color mark, described in Section 2, is slightly more
complicated. The mean central galaxy color as a function
of luminosity $\langle c|L\rangle_{\rm cen}$ (equation 15) depends on the number
density of central galaxies as a function of luminosity
$n_{\rm cen}(L)$ (equation A1), which now includes an integral:



/ dn 



cLMl 
dL 



(A21) 



+ 



Then the central galaxy color mark, which is used in the
color mark correlation functions, is also an integral
(equation 16):

$$\langle c|M\rangle_{\rm cen} = \int dL\, P_{\rm cen}(L|M)\, \langle c|L\rangle_{\rm cen}. \tag{A22}$$



The model of the color mark correlation functions, described
in Appendix B, is also modified. However, we reiterate that,
in general, the correlation functions and color mark corre- 
lation functions are not sensitive to the exact amount of 
scatter in mass at fixed luminosity. 



APPENDIX B: A HALO-MODEL OF COLOR 
MARK CORRELATIONS 

We perform our halo model calculations in Fourier space. 
The two-point correlation function is the Fourier transform 
of the power spectrum 

$$\xi(r) = \int \frac{dk}{k}\, \frac{k^3 P(k)}{2\pi^2}\, \frac{\sin kr}{kr}. \tag{B1}$$
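For readers who want to experiment, the transform $\xi(r) = \int (dk/k)\,[k^3 P(k)/2\pi^2]\,\sin(kr)/(kr)$ can be evaluated with a simple quadrature. The sketch below uses a midpoint rule with a hard cutoff `k_max`, which is adequate only for spectra that decay quickly; it is checked against a Gaussian test pair chosen for this illustration, not against a realistic halo-model spectrum.

```python
import math


def xi_from_power(power, r, k_max, n_steps=20000):
    """Midpoint-rule estimate of xi(r) = int dk k^2 P(k) sin(kr)/(kr) / (2 pi^2)."""
    dk = k_max / n_steps
    total = 0.0
    for i in range(n_steps):
        k = (i + 0.5) * dk  # midpoints avoid k = 0, where sin(kr)/(kr) is 0/0
        total += k * k * power(k) * math.sin(k * r) / (k * r)
    return total * dk / (2.0 * math.pi ** 2)


def gaussian_power(k, sigma=1.0):
    """Test spectrum whose transform is xi(r) = exp(-r^2 / (2 sigma^2))."""
    return (2.0 * math.pi) ** 1.5 * sigma ** 3 * math.exp(-0.5 * (k * sigma) ** 2)


xi_at_1 = xi_from_power(gaussian_power, 1.0, 12.0)
xi_at_2 = xi_from_power(gaussian_power, 2.0, 12.0)
```

Realistic power spectra oscillate and decay slowly, so production codes typically rely on dedicated Hankel-transform methods rather than a plain sum like this one.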



In the halo model, $P(k)$ is written as the sum of two terms:
one that arises from galaxies within the same halo and dominates
on small scales (the 1-halo term), and the other from
galaxies in different halos which dominates on larger scales
(the 2-halo term). That is,

$$P(k) = P_{1h}(k) + P_{2h}(k), \tag{B2}$$



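where

$$P_{1h}(k) = \int dM\, \frac{dn(M)}{dM}\, \frac{\langle N_{\rm cen}|M\rangle}{\bar{n}_{\rm gal}^2} \left[2\,\langle N_{\rm sat}|M\rangle\, u_{\rm gal}(k|M) + \langle N_{\rm sat}(N_{\rm sat} - 1)|M\rangle\, u_{\rm gal}(k|M)^2\right], \tag{B3}$$

$$P_{2h}(k) = \left[\int dM\, \frac{dn(M)}{dM}\, \langle N_{\rm cen}|M\rangle\, \frac{1 + \langle N_{\rm sat}|M\rangle\, u_{\rm gal}(k|M)}{\bar{n}_{\rm gal}}\, b(M)\right]^2 P_{\rm lin}(k), \tag{B4}$$

where the number density of galaxies $\bar{n}_{\rm gal}$ is (cf. equation A3)

$$\bar{n}_{\rm gal} = \int dm\, \frac{dn(m)}{dm}\, \langle N_{\rm cen}|m\rangle \left[1 + \langle N_{\rm sat}|m\rangle\right], \tag{B5}$$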



and $u_{\rm gal}(k|M)$ is the Fourier transform of the galaxy density
profile. It is standard to assume this has the same form
as the dark matter, so we use the form for $u$ given by
Scoccimarro et al. (2001). The distribution $p_{\rm sat}(N_{\rm sat})$ is
expected to be well-approximated by a Poisson distribution
(e.g., Kravtsov et al. 2004; Yang et al. 2008), so we set
$\langle N_{\rm sat}(N_{\rm sat} - 1)|M\rangle = \langle N_{\rm sat}|M\rangle^2$. The two parts of the
1-halo term in equation (B3) can be thought of as the
'center-satellite term' and the 'satellite-satellite term'.
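The Poisson identity used here, $\langle N(N-1)\rangle = \langle N\rangle^2$, is easy to confirm by summing the distribution directly. The helper below is written for this illustration only; it builds the Poisson probabilities recursively and truncates the sum at `n_max`, which is an assumption that holds as long as the mean is much smaller than `n_max`.

```python
import math


def poisson_factorial_moment(mean, n_max=200):
    """Sum N (N - 1) p(N) for a Poisson distribution with the given mean."""
    p = math.exp(-mean)  # p(0)
    total = 0.0
    for n in range(1, n_max + 1):
        p *= mean / n  # p(n) obtained from p(n - 1)
        total += n * (n - 1) * p
    return total


second_moment = poisson_factorial_moment(3.2)
```

For a mean of 3.2 the sum agrees with $3.2^2 = 10.24$ to floating-point accuracy, which is the property that lets the satellite-satellite term be written in terms of $\langle N_{\rm sat}|M\rangle^2$.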

To describe the effect of weighting each galaxy, we use
$W(k)$ to denote the Fourier transform of the weighted correlation
function. Like the power spectrum, we write this as
the sum of 1- and 2-halo terms: $W(k) = W_{1h}(k) + W_{2h}(k)$.
Since central and satellite galaxies have different properties,
we weight central and satellite galaxies separately by their
mean mass-dependent marks: $\langle c|m\rangle_{\rm cen}$ and $\langle c|m\rangle_{\rm sat}$
(Section 2). Following Sheth (2005), we write



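$$W_{1h}(k) = \int dM\, \frac{dn(M)}{dM}\, \frac{\langle N_{\rm cen}|M\rangle}{\bar{n}_{\rm gal}^2\, \bar{c}^2} \left[2\, c_{\rm cen}(M)\, \langle c_{\rm sat}|M, L_{\rm min}\rangle\, \langle N_{\rm sat}|M\rangle\, u_{\rm gal}(k|M) + \langle N_{\rm sat}|M\rangle^2\, \langle c_{\rm sat}|M, L_{\rm min}\rangle^2\, u_{\rm gal}^2(k|M)\right], \tag{B6}$$

$$\frac{W_{2h}(k)}{P_{\rm lin}(k)} = \left[\int dM\, \frac{dn(M)}{dM}\, \langle N_{\rm cen}|M\rangle\, \frac{c_{\rm cen}(M) + \langle N_{\rm sat}|M\rangle\, \langle c_{\rm sat}|M, L_{\rm min}\rangle\, u_{\rm gal}(k|M)}{\bar{n}_{\rm gal}\, \bar{c}}\, b(M)\right]^2, \tag{B7}$$

where we normalize by the mean color mark

$$\bar{c} = \int dM\, \frac{dn(M)}{dM}\, \langle N_{\rm cen}|M\rangle\, \frac{c_{\rm cen}(M) + \langle N_{\rm sat}|M\rangle\, \langle c_{\rm sat}|M, L_{\rm min}\rangle}{\bar{n}_{\rm gal}}. \tag{B8}$$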



